Local Skew Correction in Documents

Authors

  • Panagiotis Saragiotis
  • Nikos Papamarkos
Abstract

In this paper we propose a technique for detecting and correcting the skew of text areas in a document. The documents we work with may contain several text areas with different skew angles. First, a text localization procedure based on connected-component analysis is applied: the connected components of the document are extracted and filtered according to their size and geometric characteristics. Next, the candidate characters are grouped into words using a nearest-neighbor approach, and text lines of any skew are constructed from these words. The top-line and baseline of each text line are then estimated using linear regression. Nearby text lines with similar skew angles are grown to form text areas. A local skew angle is estimated for each text area, and each area is then skew-corrected independently to horizontal or vertical orientation. The technique has been extensively tested on a variety of document images, and its accuracy and robustness are compared with those of existing techniques.
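The baseline-regression step in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each character is already available as a bounding box `(left, top, right, bottom)` in image coordinates (y grows downward), uses the bottom-centre of each box as a baseline sample, and fits a line by ordinary least squares. The function names `fit_line` and `skew_angle_degrees` are hypothetical, and a real system would also need to handle descenders and outliers.

```python
import math


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all points share the same x; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def skew_angle_degrees(boxes):
    """Estimate the skew of one text line from its character boxes.

    boxes: iterable of (left, top, right, bottom) tuples (assumed input).
    Returns the angle of the fitted baseline in degrees; with y pointing
    down, a positive value means the line descends from left to right.
    """
    # Bottom-centre of each box serves as a baseline sample (assumption:
    # descenders are rare enough not to bias the fit).
    samples = [((left + right) / 2.0, bottom)
               for left, top, right, bottom in boxes]
    slope, _ = fit_line(samples)
    return math.degrees(math.atan(slope))
```

The same fit applied to the top-centre of each box would give a top-line estimate. Rotating the text area by the negative of the estimated angle would then bring the line to horizontal.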


Similar resources

Skew Correction of Textural Documents

Two algorithms for accurate skew detection and correction of textual documents are presented. They depend on finding a horizontal RLSA image of the skewed document. The average skew of selected black connected components in the RLSA image is considered as the skew angle for the whole document which is finally rotated in the opposite direction by that amount to obtain the final corrected image. ...


Ultra High Speed Approach for Document Skew Detection and Correction Based On Centre of Gravity

Skew detection and correction (SDC) has a direct effect on the efficiency and accuracy of document segmentation and analysis, and is therefore considered a very important step in document analysis. Skew is a major problem in document analysis for every language. For Arabic/Persian scripts the problem is more severe because of special features of these languages. In this paper a...


Angular Skew Correction Algorithm for Handwritten Hindi Text

In large-scale digitization of handwritten and printed documents, the documents are scanned and stored in digital form. Since many of these documents are handwritten, they contain errors such as angular skew of words. Skew detection and removal is a pre-processing step performed before OCR software is used to digitize the documents. One type of skewness that is most difficult to detect and correct is...


Review on Skew Detection and Correction Techniques

The image obtained by scanning an opened book page usually suffers from various scanning artifacts. One major artifact is the skew defect. This defect reduces the quality of the scanned images and causes many problems for document image analysis. Such documents are difficult for an Optical Character Recognizer (OCR) to interpret. Some effective methods are present to rec...


Skew Correction for Chinese Character using Hough Transform

Chinese handwritten character recognition is an emerging field in computer vision and pattern recognition. Documents acquired through scanner, mobile, or camera devices are often prone to skew, and correcting the skew of such documents is a major task and an important factor in optical character recognition. The goal of this work is to correct the skew of such documents. In this paper we have proposed a n...



Journal:
  • IJPRAI

Volume 22


Publication date: 2008